ABSTRACT

We present a clustering analysis of 370 high-confidence Hα emitters (HAEs) at z = 2.23.
The HAEs are detected in the Hi-Z Emission Line Survey (HiZELS), a large-area blank
field 2.121 μm narrowband survey using the United Kingdom Infrared Telescope (UKIRT)
Wide Field Camera (WFCAM). Averaging the two-point correlation function of HAEs in two
~1 degree scale fields (United Kingdom Infrared Deep Sky Survey/Ultra Deep Survey [UDS]
and Cosmological Evolution Survey [COSMOS] fields) we find a clustering amplitude
equivalent to a correlation length of $r_0 = 3.7 \pm 0.3\,h^{-1}$ Mpc for galaxies with star
formation rates of $>7\,M_\odot$ yr$^{-1}$. The data are well-fitted by the expected correlation
function of Cold Dark Matter, scaled by a bias factor: $\omega_{\rm HAE} = b^2 \omega_{\rm DM}$ where
$b = 2.4^{+0.1}_{-0.2}$. The corresponding 'characteristic' mass for the halos hosting HAEs is
$\log(M_h/[h^{-1}\,M_\odot]) = 11.7 \pm 0.1$. Comparing to the latest semi-analytic GALFORM
predictions for the evolution of HAEs in a ΛCDM cosmology, we find broad agreement with
the observations, with GALFORM predicting a HAE correlation length of ~4 $h^{-1}$ Mpc.
Motivated by this agreement, we exploit the simulations to construct a parametric model
of the halo occupation distribution (HOD) of HAEs, and use this to fit the observed
clustering. Our best-fitting HOD can adequately reproduce the observed angular
clustering of HAEs, yielding an effective halo mass and bias in agreement with that
derived from the scaled $\omega_{\rm DM}$ fit, but with the relatively small sample size the
current data provide a poor constraint on the HOD. However, we argue that this approach
provides interesting hints into the nature of the relationship between star-forming
galaxies and the matter field, including insights into the efficiency of star formation
in massive halos. Our results support the broad picture that 'typical' ($\lesssim L^\star$)
star-forming galaxies have been hosted by dark matter haloes with
$M_h \lesssim 10^{12}\,h^{-1}\,M_\odot$ since z ≈ 2, but with a broad occupation distribution and
clustering that is likely to be a strong function of luminosity.
1 INTRODUCTION

The Cold Dark Matter model contends that galaxies are biased 
tracers of an unseen, underlying cold dark matter distribution that 
has evolved from primordial fluctuations into a rich hierarchy of 
structure, with baryons forming into galaxies within gravitation- 
ally bound dark matter halos (White & Rees 1978). Understanding 
the relationship between the distribution of observed galaxies, their 
properties, and their co-evolution with the latent matter field is a 
key question of observational cosmology, and can yield important 
information about a galaxy population (Peebles 1980). 

* E-mail: jimgeach@physics.mcgill.ca 



One of the simplest, but also the most powerful, tools at our 
disposal to address this issue is the clustering of galaxies, as has 
been recognised for many years (Rubin 1954; Groth & Peebles 
1977; Peebles 1980). At a basic level, the statistics of counts of 
galaxy pairs, relative to random distributions, reveal the scales over 
which the fluctuations in the spatial distribution of galaxies are cor- 
related, and therefore a measure of how 'clustered' a population is; 
longer correlation lengths correspond to stronger clustering and an 
indication that those galaxies are hosted by, on average, more bi- 
ased and hence more massive dark matter halos (e.g. Mo & White 
1996). 

In the local Universe, mature wide-area surveys such as the 
Sloan Digital Sky Survey (SDSS; York et al. 2000) and the Two Degree
Field (2dF) Redshift Survey (Colless et al. 2001) have delivered
highly accurate measurements of the clustering of populations
of galaxies and quasars (Norberg et al. 2001; Myers et al. 2006;
Ross et al. 2009; Wake et al. 2008; Zehavi et al. 2011). A key result
of these studies is the observation that the clustering amplitude is 
enhanced as the mass limit of the galaxy sample increases, indicat- 
ing that the more massive galaxies are hosted by more massive ha- 
los. Furthermore, it is clear that passive galaxies are more strongly 
clustered on small spatial scales compared to galaxies with ongo- 
ing star formation (e.g. Norberg et al. 2002). Over the past decade 
a method of interpreting these observations has been developed (in 
part motivated by large N-body simulations) which expresses the
distribution of galaxies relative to the matter field through a prob- 
abilistic halo occupation distribution (HOD; Benson et al. 2000; 
Cooray & Sheth 2002; Zheng et al. 2005) or, similarly, a condi- 
tional luminosity function (Yang et al. 2003). Halo models pro- 
vide an intuitive framework to relate observed projected correlation 
functions to the hierarchical paradigm, and are becoming increas- 
ingly common tools for the interpretation of clustering data. 

Clustering analyses are now routine for high redshift (z > 
1) mass-limited galaxy samples, largely thanks to the increased 
efficiency of deep and wide-area (~1 degree scale) multi-band 
(ultraviolet-optical-near-infrared) imaging surveys offering excel- 
lent photometric redshifts (accurate to the few percent level at 
z ~ 1) and stellar mass estimates for large numbers of massive 
galaxies (e.g. Wake et al. 2011). When it comes to measuring the 
clustering properties of purely star-forming galaxies at high red- 
shifts, which - in the halo model context - could yield impor- 
tant clues about the environmental trends in the history of stellar 
mass assembly, the main challenge is to understand the selection 
function, since most broad-band selections (Lyman Break, BX/BM, 
'sBzK', and so on) can result in heterogeneous samples with broad
redshift distributions, and can be biased towards stellar mass in 
complicated ways. The latter two issues are undesirable, given the 
strong evolution in the specific star formation rates (SFRs) of galaxies
since z ~ 1-2 (Noeske et al. 2007; Elbaz et al. 2011).

Narrowband ($\Delta\lambda/\lambda \sim 10^{-3}$) selections of star-forming
galaxies are of great value in this regard, as they allow for the
clean selection of galaxies based simply on the strength of an emission
line sampled by the filter. The narrow bandpass corresponds
to a narrow redshift window, within which the population is not
expected to evolve. The main contaminants to such a survey are
emission-line galaxies at different redshifts corresponding to the
redshifting of alternative lines into the band. For high-z surveys
these contaminants are predominantly lower-redshift populations
and easily removed (see §2). Most narrowband-selected clustering
analyses conducted so far have targeted the Lyα emission line,
redshifted into the optical window for z ~ 3 and thus convenient for
deep, wide-field surveys out to very high redshifts (e.g. Ouchi et al.
2003). The development of wide-format infrared cameras over the
past decade has now cleared the way for panoramic near-infrared
narrowband surveys that target the Hα nebular line at epochs of
z ~ 1-2, spanning the peak in the global star formation rate density,
and thus one of the most important intervals in galaxy formation
studies. Hα is favoured over the Lyα line because of its (a) weaker
dust obscuration (and ease of extinction correction, if the Balmer
decrement is known), (b) better understood radiative transfer compared
to the resonant Lyα and (c) more accurate luminosity-to-star
formation rate calibrations from surveys of local star forming regions.
It is also important to measure the clustering of HAEs in
preparation for the Euclid mission, as one of the probes used to
constrain the nature of dark energy will be a slitless redshift survey
of HAEs (Laureijs et al. 2011).

In this paper we present a clustering analysis of Hα emitters
(HAEs) at z = 2.23 detected in our Hi-Z Emission Line (HiZELS)
survey: a wide-field near-infrared narrowband survey selecting Hα
emitting galaxies in three narrow 'slices' of redshift at z = 0.84,
z = 1.47 and z = 2.23 (e.g. Geach et al. 2008; Best et al. 2010;
Sobral et al. 2010, 2012). In §2 we provide a brief review of the
observations and selection technique (although we refer the reader
to the aforementioned HiZELS publications for a comprehensive
description); in §3 we describe the clustering analysis; and in §4
we present the results, approaching the interpretation
of the data with a series of models of increasing sophistication,
from a simple power law fit to a full halo model. In §5 we discuss
our findings and conclude with a review of the main results in §6.
Throughout this work we quote magnitudes on the AB system, and
assume a cosmology with $\Omega_{\rm m} = 0.27$, $\Omega_\Lambda = 0.73$, $\sigma_8 = 0.8$ and
$H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$ with h = 0.7. The co-moving distance
to z = 2.23 is 5128 Mpc in this cosmology.



2 NARROWBAND SELECTION OF Hα EMITTERS

The observations and selection of HAEs in the primary HiZELS 
fields of the United Kingdom Infrared Deep Sky Survey (UKIDSS) 
Ultra Deep Survey (UDS; Lawrence et al. 2007) and Cosmological 
Evolution Survey (COSMOS; Scoville et al. 2007) are described in 
more detail by Sobral et al. (2012) - we refer the reader to that ar- 
ticle for a comprehensive overview of the selection technique, but 
in short we first select galaxies based on the significance of their 
'colour excess' in the narrow band. Corrections to the continuum
slope over the bandpass of the K-band filter (which could mimic a
colour excess) are performed by interpolating over the neighbouring
broad band (in this case, the H-band). Further broad band colour
selections are performed to refine the selection (which can be con- 
taminated by lower redshift Paschen and Brackett lines for exam- 
ple). Here we perform a flux cut to obtain a catalogue of approxi- 
mately uniform depth across both UDS and COSMOS fields. 

The flux limit at which we are uniformly complete to
>50% over both UDS and COSMOS fields is
$f_{\rm H\alpha} = 5 \times 10^{-17}$ erg s$^{-1}$ cm$^{-2}$, corresponding to a luminosity of
$\log_{10}(L_{\rm H\alpha}/{\rm erg\,s^{-1}}) = 42.3$ at z = 2.23. Note that variations
in the exact depth of each WFCAM pointing (each field is a mosaic
of several pointings) correspond to a variation in the surface density
of galaxies. The impact of this on our measured clustering is
in part absorbed into the error bars calculated by jackknife resampling
of the survey area that we describe in §3.2. Assuming a $L_{\rm H\alpha}$-SFR
calibration of $1.3 \times 10^{41}$ erg s$^{-1}$ per $M_\odot$ yr$^{-1}$ (Kennicutt et
al. 1998), our selection is SFR limited at $\gtrsim 7\,M_\odot$ yr$^{-1}$ assuming a
canonical 1 mag of extinction in the Hα line. Foreground sources
are easily removed by high-quality photometric redshifts estimated
from UV-optical-near-infrared photometry in both the UDS and
COSMOS fields. Sobral et al. (2012) present the $z_{\rm phot}$ distribution
for K-band selected HAEs, indicating the most significant peak in
the distribution at z = 2.23, but with low-redshift enhancements
at the expected wavelengths of Paα, Paβ, He I, [S III], and at high
redshift [O III] at z ~ 3.3.
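The conversion from the limiting line flux to a luminosity can be checked numerically. The following is a minimal sketch, assuming the flat ΛCDM parameters quoted in §1 and the standard relation $L = 4\pi d_L^2 f$; the function names are illustrative and are not part of the HiZELS pipeline.

```python
import numpy as np
from scipy.integrate import quad

C_KMS = 299792.458      # speed of light in km/s
MPC_IN_CM = 3.0857e24   # one megaparsec in centimetres


def luminosity_distance_cm(z, h=0.7, omega_m=0.27, omega_l=0.73):
    """Luminosity distance in cm for a flat LCDM cosmology (assumed parameters)."""
    inv_e = lambda zp: 1.0 / np.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)
    d_comoving_mpc = C_KMS / (100.0 * h) * quad(inv_e, 0.0, z)[0]
    return (1.0 + z) * d_comoving_mpc * MPC_IN_CM


def line_luminosity(flux_cgs, z):
    """Line luminosity in erg/s from an observed line flux in erg/s/cm^2."""
    return 4.0 * np.pi * luminosity_distance_cm(z) ** 2 * flux_cgs


# Flux limit quoted in the text, evaluated at the survey redshift.
log_l_halpha = np.log10(line_luminosity(5e-17, 2.23))
```

Under these assumptions `log_l_halpha` should land close to the limiting luminosity quoted above; no extinction correction is applied in this sketch.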

To refine the photometric selection, we make use of a key design
feature of HiZELS, namely the fact that our custom-made J
and H-band narrow-band filters select [O II] and [O III] emitters at
z = 2.23 respectively. Thus, double or triple detections for the
same source in each of the narrow-bands provide an extremely robust
selection with almost no contamination. There are 84 z = 2.23
HiZELS sources detected in this way, and these are used to refine
photometric redshift cuts and broad-band photometric selections
as described in further detail by Sobral et al. (2012). In summary,
the overall contamination rate from non-HAEs in our sample is
expected to be $\lesssim$10%.

The total number of galaxies detected in each field satisfying
these selection criteria is 230 and 140 HAEs in COSMOS and UDS
respectively. The higher number of HAEs in the COSMOS field is
due to the difference in survey areas: HiZELS has so far covered
1.23 deg$^2$ in COSMOS and 0.75 deg$^2$ in UDS. Note that the surface
density of HAEs measured in the two independent fields is nearly
identical, $\Sigma_{\rm HAE} \sim 190$ deg$^{-2}$.



sive review of this and other error estimation methods). In short, 
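the survey volume is split into $N$ sub-areas, and $\omega(\theta)$ calculated $N$
times, each time excluding one of the sub-areas. The elements of
the covariance matrix are then given by:

$$ C_{ij} = \frac{N-1}{N} \sum_{k=1}^{N} (\omega_i^k - \bar{\omega}_i)(\omega_j^k - \bar{\omega}_j), \tag{4} $$

where $\omega_i^k$ is the correlation function (equation 1) measured for the
$i$th angular bin, for the $k$th jackknife resampling, and

$$ \bar{\omega}_i = \frac{1}{N} \sum_{k=1}^{N} \omega_i^k. \tag{5} $$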



3 CLUSTERING ANALYSIS 

3.1 Two-point angular correlation function estimator 

We calculate the two-point angular correlation function, $\omega(\theta)$, using
the estimator proposed by Landy & Szalay (1993),

$$ \omega(\theta) = 1 + \left(\frac{N_R}{N_D}\right)^2 \frac{DD(\theta)}{RR(\theta)} - 2\,\frac{N_R}{N_D}\,\frac{DR(\theta)}{RR(\theta)}, \tag{1} $$

where $N_D$ and $N_R$ are the number of galaxies in the data and random
catalogue respectively, and $DD$, $RR$ and $DR$ are the number
of data-data, random-random and data-random pairs at angular
separation $\theta$. The modified Poissonian uncertainty is:

$$ \delta\omega(\theta) = \frac{1 + \omega(\theta)}{\sqrt{DD(\theta)}}, \tag{2} $$

although this is certainly an underestimate of the true error (we estimate
the full covariance matrix in §3.2). For the random catalogue,
we distribute $20 N_D$ points uniformly over the survey areas, avoiding
masked regions (cross-talk artifacts, bright stellar halos, etc.).
We combine the results from the two independent survey volumes
at the pair-counts stage, such that $DD = DD_{\rm UDS} + DD_{\rm COSMOS}$,
etc. In practice this gives very similar results to averaging the
individual $\omega(\theta)$, weighting by the Poisson uncertainty.
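The estimator of equation 1 can be illustrated with a brute-force pair count. This is a minimal sketch, not the survey pipeline: it assumes a small flat-sky field with positions as (x, y) pairs, ignores masks, and counts $DD$ and $RR$ as ordered pairs (each pair twice) so that the factor of 2 on the cross term in equation 1 is appropriate. The function names and the toy catalogues are hypothetical.

```python
import numpy as np


def pair_counts(a, b, bins, auto=False):
    """Histogram of flat-sky separations between point sets a and b.

    For auto-correlations self-pairs are dropped and every pair is counted
    twice (ordered pairs), matching the convention assumed for equation 1.
    """
    sep = np.hypot(a[:, None, 0] - b[None, :, 0], a[:, None, 1] - b[None, :, 1])
    if auto:
        sep = sep[~np.eye(len(a), dtype=bool)]
    return np.histogram(sep.ravel(), bins=bins)[0].astype(float)


def landy_szalay(data, rand, bins):
    """Landy & Szalay (1993) estimator written with raw pair counts."""
    n_d, n_r = len(data), len(rand)
    dd = pair_counts(data, data, bins, auto=True)
    rr = pair_counts(rand, rand, bins, auto=True)
    dr = pair_counts(data, rand, bins)
    ratio = n_r / n_d
    return 1.0 + ratio**2 * dd / rr - 2.0 * ratio * dr / rr


# Toy check: an unclustered "data" set should give w(theta) close to zero.
rng = np.random.default_rng(42)
data = rng.uniform(0.0, 1.0, size=(300, 2))
rand = rng.uniform(0.0, 1.0, size=(1500, 2))
bins = np.linspace(0.05, 0.30, 6)
w = landy_szalay(data, rand, bins)
```

Combining two fields at the pair-count stage, as described above, would amount to summing the `dd`, `dr` and `rr` arrays of each field before forming the ratio.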

A correction must be applied to $\omega(\theta)$ due to the finite area
surveyed and the fact that the mean density of galaxies is estimated
from the sample itself and would be biased due to cosmic variance.
The integral constraint ($C$; Groth & Peebles 1977) corresponds to
a scale-independent underestimation of $\omega(\theta)$. As in Geach et al.
(2008), we calculate $C$ following Roche et al. (1999):

$$ C = \frac{\sum_i \omega(\theta_i)\,RR(\theta_i)}{\sum_i RR(\theta_i)}, \tag{3} $$

where we model $\omega(\theta)$ using the scaled angular correlation function
of dark matter, which is an excellent fit to the data and superior to
a single power law (we discuss this analysis in §4.1). We evaluate
equation 3 iteratively: first fitting the model to the data, calculating
$C$ and then applying this correction to the data and fitting again,
repeating this process until there is convergence. We find $C = 0.134$
for the combined area, and correct the measured $\omega(\theta)$ for this factor
before fitting models.
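The iterative evaluation of equation 3 can be sketched as follows. This is an illustration under stated assumptions: a fixed-slope power law stands in for the scaled dark matter model used in the text, the fit is an unweighted least-squares fit for the amplitude only, and the correction is applied additively. The function names and the synthetic inputs are hypothetical.

```python
import numpy as np


def integral_constraint(w_model, rr):
    """Equation 3: RR-weighted mean of the model correlation function."""
    return np.sum(w_model * rr) / np.sum(rr)


def fit_with_integral_constraint(theta, w_obs, rr, beta=0.8, n_iter=50, tol=1e-10):
    """Alternate between fitting the amplitude and updating C until C converges."""
    shape = theta ** (-beta)
    c = 0.0
    for _ in range(n_iter):
        # Least-squares amplitude of A * theta^-beta fitted to the corrected data.
        amp = np.sum((w_obs + c) * shape) / np.sum(shape**2)
        c_new = integral_constraint(amp * shape, rr)
        converged = abs(c_new - c) < tol
        c = c_new
        if converged:
            break
    return amp, c


# Synthetic test case: a known power law, biased low by its own constraint.
theta = np.logspace(1.1, 2.9, 10)   # arcsec
rr = theta**2                       # random pair counts in log-spaced bins
amp_true = 29.0
w_true = amp_true * theta**-0.8
w_obs = w_true - integral_constraint(w_true, rr)
amp_fit, c_fit = fit_with_integral_constraint(theta, w_obs, rr)
```

In this toy case the loop recovers the input amplitude after a handful of iterations, because the constraint is small compared with the clustering signal on the fitted scales.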

3.2 Error estimation 

We estimate the full covariance using the 'delete one jackknife' 
method (Shao 1986, and see Norberg et al. 2009 for a comprehen- 



We split the survey volume into 32 sub-regions and evaluate
equation 1 for each jackknife realisation, omitting one sub-region
each time. The uncertainty on the correlation function evaluated at
each angular bin is given by $\delta\omega(\theta_i) = \sqrt{C_{ii}}$. The $\chi^2$
difference between the data ($\omega$) and an arbitrary
model ($\omega^{\rm model}$), taking into account the covariance, is

$$ \chi^2 = (\omega - \omega^{\rm model})^{T}\, C^{-1}\, (\omega - \omega^{\rm model}), \tag{6} $$

with the 1σ uncertainty on a model parameter equivalent to the
range $\Delta\chi^2 = 1$.
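The delete-one jackknife covariance and the covariance-weighted $\chi^2$ can be written compactly. The sketch below assumes the jackknife measurements are stored as an array with one row per omitted sub-region; the function names and the toy data are illustrative only.

```python
import numpy as np


def jackknife_covariance(w_jack):
    """Delete-one jackknife covariance.

    w_jack has shape (N, n_bins); row k holds w(theta) measured with
    sub-region k removed.
    """
    w_jack = np.asarray(w_jack, dtype=float)
    n = w_jack.shape[0]
    diff = w_jack - w_jack.mean(axis=0)
    return (n - 1.0) / n * diff.T @ diff


def chi_squared(w_data, w_model, cov):
    """Chi-squared of the residual vector using the full covariance matrix."""
    resid = np.asarray(w_data, dtype=float) - np.asarray(w_model, dtype=float)
    return float(resid @ np.linalg.solve(cov, resid))


# Toy check with a single bin: the jackknife variance of leave-one-out
# means equals the usual variance of the sample mean.
rng = np.random.default_rng(1)
x = rng.normal(size=(32, 1))
loo_means = (x.sum(axis=0) - x) / (len(x) - 1)
cov = jackknife_covariance(loo_means)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse covariance explicitly, which tends to be more stable when the matrix is estimated from a modest number of sub-regions.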



4 RESULTS 

We present the results in Figure 1, corrected for the integral constraint,
and including the covariance uncertainties evaluated in
equation 4. Correlation functions are often fitted by a single power
law, $\omega(\theta) = A\theta^{-\beta}$, usually with $\beta \sim 0.8$. This is adequate to fit
the overall trend in the data, but the observed correlation function
clearly deviates from a simple power law, especially at $\theta > 1'$.
In part, the deviation of the observed correlation function at large
separations is due to the break-down of Limber's approximation at
$\theta \gtrsim 600''$ for samples where $\Delta z$ is narrow (Simon 2007; Sobral
et al. 2010). In this case, even if the spatial correlation function
is a power law, the angular correlation function will depart from
a power law at large angular separations. However, we also expect
that a single power law is insufficient to model the clustering across
the full angular range for physical reasons related to the relative
clustering of satellite galaxies within single dark matter halos to
the clustering of the halos themselves.

We explore this in the following sections; for now, however, we
start our analysis with the simple power-law model fitted to data
at scales $\theta < 600''$, which is useful for obtaining an estimate of
the correlation length of the galaxies and easily comparable to the
clustering of other populations. We perform minimum-$\chi^2$ fits for
the amplitude of the correlation function, fixing the slope with
$\beta = 0.8$. We find a clustering amplitude $A = 29 \pm 4$ arcsec$^{0.8}$, with a
reduced $\chi^2/\nu = 0.9$. Throughout, we quote 1σ uncertainties on
the $\chi^2$ fit using the full covariance matrix (equation 6).
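For a model that is linear in a single amplitude, minimising equation 6 has a closed form, and the $\Delta\chi^2 = 1$ interval follows directly. The sketch below assumes a fixed-slope power law $A\theta^{-\beta}$ and a known covariance matrix; the function name and the toy inputs are hypothetical and do not reproduce the survey measurements.

```python
import numpy as np


def fit_amplitude(theta, w, cov, beta=0.8):
    """Best-fitting amplitude and 1-sigma error for w = A * theta^-beta.

    With s = theta^-beta, chi^2 is minimised at A = (s^T C^-1 w) / (s^T C^-1 s),
    and Delta chi^2 = 1 corresponds to sigma_A = (s^T C^-1 s)^(-1/2).
    """
    shape = np.asarray(theta, dtype=float) ** (-beta)
    cinv_shape = np.linalg.solve(cov, shape)
    norm = shape @ cinv_shape
    amp = (np.asarray(w, dtype=float) @ cinv_shape) / norm
    return amp, 1.0 / np.sqrt(norm)


# Toy data: an exact power law with 10 per cent uncorrelated errors.
theta = np.array([10.0, 30.0, 100.0, 300.0])   # arcsec
w = 29.0 * theta**-0.8
cov = np.diag((0.1 * w) ** 2)
amp, sigma_amp = fit_amplitude(theta, w, cov)
```

With a full (non-diagonal) jackknife covariance the same expressions apply unchanged; only `cov` differs.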

If the real space correlation function can be assumed to be
$\xi(r) = (r/r_0)^{-\gamma}$, where $r_0$ is the real-space correlation length and
$\gamma = \beta + 1$, the amplitude of the correlation function $A$ can be
related to $r_0$ using a version of Limber's equation (Limber 1954;
Peebles 1980):

$$ A = r_0^{\gamma}\,\frac{\Gamma(1/2)\,\Gamma((\gamma-1)/2)}{\Gamma(\gamma/2)} \int_0^{\infty} \frac{H_z}{c} \left(\frac{dn}{dz}\right)^{2} \chi_z^{1-\gamma}\,dz, \tag{7} $$
Figure 1. Two-point angular correlation function of HAEs in the COSMOS and UDS fields. (left) We show two model fits to the data: (a) a simple power
law $A\theta^{-0.8}$ (dashed line) and (b) the projected correlation function of dark matter, scaled by a bias parameter, $b^2$ (dotted and solid lines). The power law is a
reasonable fit to the general shape of the HAE correlation function, but the dark matter model also provides a good fit, and captures the deviation from a simple
power law at all scales. The error bars are calculated from the diagonal elements of the covariance matrix which was estimated from the jackknife re-sampling
method (we show for comparison the Poisson errors as thicker bars). The correlation function for the individual fields is also shown; for clarity we
do not show the error bars for these. Note that the combined COSMOS+UDS $\omega(\theta)$ values have been corrected for the integral constraint (§3.1, equation 3),
whereas the individual fields have not. (right) Combined correlation function as (left), but shown with the best fitting HOD model (described in §4.3). The
halo model successfully models the amplitude of the clustering strength on all measured scales, including the break at ~1 $h^{-1}$ Mpc indicating the transition
between the dominance of the one- and two-halo term in the halo model.



where $A$ is the amplitude of the correlation function evaluated at
$\theta = 1$ radian, $\Gamma$ is the Gamma function, $H_z$ is the Hubble parameter
at redshift $z$, $\chi_z$ is the co-moving radial distance to $z$ and $dn/dz$
is the redshift distribution of the population, normalised to unity.
We assume the redshift distribution of HAEs in our narrowband
selection is set by the H$_2$S(1) filter profile, which can be described by
a Gaussian function centred at z = 2.233, with full width at half
maximum of $\delta z = 0.03$ (e.g. Sobral et al. 2010). Here we make
the further assumption that we are 100% incomplete in the wings
(>FWHM) of the H$_2$S(1) transmission function, and therefore
define the redshift distribution to be:

$$ \frac{dn}{dz} = \begin{cases} n_0 \exp\left[-\dfrac{(z - z_c)^2}{2\sigma^2}\right] & \text{for } |z - z_c| \leq 0.015 \\ 0 & \text{for } |z - z_c| > 0.015, \end{cases} \tag{8} $$

where $z_c = 2.233$ and $\sigma = 0.0126$ and $n_0$ is the normalisation
constant. This form of the redshift distribution attempts to account
for the fact that there is a (luminosity dependent) bias in our
selection in favour of HAEs with observed Hα emission closer to the
peak transmission of the filter. We are currently engaged in
spectroscopic follow-up projects to properly characterise the redshift
distribution of HAEs in our sample. Adopting this $dn/dz$ in
equation 7, we find $r_0 = 3.7 \pm 0.3\,h^{-1}$ Mpc, which is similar to that
derived in Geach et al. (2008) for a smaller sample. Note that the
effect of applying a different redshift distribution on $r_0$ corresponds
to a scaling in amplitude of $\int dz_a\,(dn_a/dz_a)^2 / \int dz_b\,(dn_b/dz_b)^2$.
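The inversion of Limber's equation for $r_0$ can be sketched numerically. This is an illustration under several assumptions: the standard Gamma-function prefactor $\Gamma(1/2)\Gamma((\gamma-1)/2)/\Gamma(\gamma/2)$, a truncated Gaussian $dn/dz$ with the parameters quoted above, the flat ΛCDM cosmology of §1, and an amplitude converted from arcsec$^{\beta}$ to radians. The function names are hypothetical, and the result depends on the adopted cosmology and on details of the fit, so it should not be expected to reproduce the quoted value exactly.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as gamma_fn

C_KMS = 299792.458
OMEGA_M, OMEGA_L = 0.27, 0.73


def hubble_over_c(z):
    """H(z)/c in units of h Mpc^-1 for flat LCDM."""
    return 100.0 * np.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L) / C_KMS


def comoving_distance(z):
    """Co-moving radial distance in h^-1 Mpc."""
    return quad(lambda zp: 1.0 / hubble_over_c(zp), 0.0, z)[0]


def trapezoid(y, x):
    """Simple trapezoidal rule (kept local to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def r0_from_amplitude(amp_arcsec, beta=0.8, z_c=2.233, sigma=0.0126, half_width=0.015):
    """Correlation length in h^-1 Mpc from an amplitude in arcsec^beta."""
    gam = beta + 1.0
    amp_rad = amp_arcsec * (np.pi / (180.0 * 3600.0)) ** beta
    z = np.linspace(z_c - half_width, z_c + half_width, 401)
    dndz = np.exp(-((z - z_c) ** 2) / (2.0 * sigma**2))
    dndz /= trapezoid(dndz, z)                      # normalise to unity
    chi = np.array([comoving_distance(zi) for zi in z])
    integral = trapezoid(hubble_over_c(z) * dndz**2 * chi ** (1.0 - gam), z)
    prefactor = gamma_fn(0.5) * gamma_fn((gam - 1.0) / 2.0) / gamma_fn(gam / 2.0)
    return (amp_rad / (prefactor * integral)) ** (1.0 / gam)


r0 = r0_from_amplitude(29.0)
```

Because the amplitude scales with $\int (dn/dz)^2 dz$, widening the assumed redshift distribution in this sketch raises the inferred $r_0$ for a fixed measured amplitude.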

Contamination by non-HAEs reduces the amplitude of the correlation
function by a factor $(1 - f)^2$ where $f$ is the contamination
fraction. As described in §2 it is likely that the contamination rate
is of order 10%, corresponding to a factor $(1 - 0.1)^2 \approx 0.8$ attenuation in the
clustering amplitude. We do not apply a correction to our measured
parameters here until a more accurate estimate of the contamination
rate is obtained from our spectroscopic survey.



4.1 Estimating the bias and characteristic halo mass of 
HAEs at z = 2.23 

The autocorrelation function of galaxies can be related to that of the
underlying dark matter via the linear bias: $\xi_{\rm g} = b^2 \xi_{\rm DM}$. This arises
because galaxies forming in the peaks of a Gaussian random fluctuation
field will be clustered in a way that is biased relative to that of the dark
matter. This bias will depend on the details of galaxy formation relative
to the underlying matter density. It is therefore an important
part of our understanding of a particular galaxy population.

With an estimate for $\xi_{\rm DM}$, we can fit the observed projected
angular correlation function for the scaling $b^2$. To evaluate $\xi_{\rm DM}$
(or rather, its projection, $\omega_{\rm DM}$), we follow the method described
by Hickox et al. (2012) and others (e.g. Myers et al. 2007; Coil
et al. 2008) that we briefly review here. First, the projected angular
correlation function of dark matter is derived by calculating the
nonlinear dark matter power spectrum, $\Delta^2_{\rm NL}(k, z)$, using the code
HALOFIT (Smith et al. 2003), assuming $\Gamma = \Omega_{\rm m} h = 0.21$ as the
slope of the initial fluctuation power spectrum. The projected
correlation function $\omega_{\rm DM}(\theta)$, averaged over the redshift distribution
of the HAEs, can then be calculated following Myers et al. (2007,
equation A6), which projects the power spectrum into the angular
Table 1. Summary of model fit parameters to the observed clustering of HAEs at z = 2.23. Masses are in units of $h^{-1}\,M_\odot$ and uncertainties reflect the 1σ range.

[Table body not recoverable from the source. Surviving labels, in order: Power-law$^a$; $r_0/(h^{-1}$ Mpc); Dark matter$^b$; $\log_{10}(M_c)$; $\log_{10}(M_{\rm eff})$; Halo occupation distribution.]

$^a$ $\xi = (r/r_0)^{-1.8}$ fit for scales $\theta < 600''$.
$^b$ $\xi_{\rm gal} = b^2_{\rm gal}\,\xi_{\rm DM}$. Mass is the 'characteristic' halo mass for the quoted bias.
$^c$ See section 4.3 for further details. [Remainder of note truncated in source.]


correlation function using Limber's equation. The dark matter
correlation function is shown in Figure 1.

We fit for the $b^2$ scaling that minimises $\chi^2$ with respect to the
observed HAE angular correlation function, yielding $b_{\rm HAE} = 2.4 \pm
0.1$, with reduced $\chi^2/\nu = 0.8$, formally slightly poorer than the
power law fit. The characteristic halo mass $M$ is related to the bias
through the parameterisation $b = f(\nu)$ where $\nu$ is the ratio of the
critical threshold for spherical collapse to the r.m.s. density
fluctuation for a mass $M$: $\nu = \delta_c/\sigma(M)$. The function $f(\nu)$ for a
given cosmology is usually derived by fitting a form to the output
of N-body simulations; here we apply the function of Tinker et
al. (2010) (assuming halos are all 200 times the mean density of
the Universe). The Tinker et al. fitting function is similar to that of
Sheth, Mo & Tormen (2001), but predicts slightly larger $b$ for large
$\nu$ and slightly lower $b$ for small $\nu$ (asymptoting to constant $b$ for
low mass halos, and scaling as a power law for high masses). For a
bias of $b_{\rm HAE} = 2.4^{+0.1}_{-0.2}$, we calculate a characteristic halo mass of
$\log_{10}(M_h/[h^{-1}\,M_\odot]) = 11.7 \pm 0.1$ at z = 2.23. This characteristic
$M_{\rm halo}$ corresponds to the top-hat virial mass (see e.g. Peebles
1993 and references therein), in the simplified case in which all
objects in a given sample reside in halos of the same mass. We note
that this mass is approximately equal to the 'effective' halo mass
derived from full HOD modelling, as discussed in §4.3, but differs
from some prescriptions in the literature which assume that sources
occupy all halos above some minimum mass. Given the halo mass
function at z ~ 2 (e.g. Tinker et al. 2008) the derived minimum
mass is typically a factor of ~2 lower, for the same clustering
amplitude, than the characteristic mass quoted here.
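The first step of the bias-to-mass conversion, inverting $b = f(\nu)$ for the peak height, can be sketched as follows. The coefficients below are the Tinker et al. (2010) fitting-function parameters as commonly quoted for an overdensity of 200 times the mean density; they are reproduced from memory and should be checked against that paper before use. Converting $\nu$ to a halo mass additionally requires $\sigma(M)$ at z = 2.23 from the linear power spectrum, which is not attempted here. The function name is illustrative.

```python
import numpy as np
from scipy.optimize import brentq

DELTA_C = 1.686  # critical overdensity for spherical collapse


def tinker10_bias(nu, delta=200.0):
    """Large-scale halo bias b(nu) in the Tinker et al. (2010) functional form."""
    y = np.log10(delta)
    x = np.exp(-((4.0 / y) ** 4))
    big_a = 1.0 + 0.24 * y * x
    a = 0.44 * y - 0.88
    big_b, b = 0.183, 1.5
    big_c = 0.019 + 0.107 * y + 0.19 * x
    c = 2.4
    return (1.0
            - big_a * nu**a / (nu**a + DELTA_C**a)
            + big_b * nu**b
            + big_c * nu**c)


# Peak height corresponding to the measured bias of the HAE sample.
nu_hae = brentq(lambda nu: tinker10_bias(nu) - 2.4, 0.5, 5.0)
```

Under these assumptions a bias of 2.4 corresponds to a peak height of roughly two, i.e. halos that are moderately rare at the survey redshift.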

4.2 Comparison to models of galaxy formation 

GALFORM (Cole et al. 2000) is a successful semi-analytic model, 
or rather a suite of models, that describe galaxy formation using 
simplified prescriptions for the radiative cooling of gas within dark 
matter halos, star formation and feedback (both through supernovae 
and active galactic nuclei [AGN]), along with a hierarchical compo- 
nent for growth set by the merger histories of the halos the galaxies 
occupy. The latter is achieved by coupling semi-analytic models to
large N-body simulations in which halos (usually defined as regions
within which the matter density is $\Delta = 200 \times \rho(z)$) can be
identified and tracked (see Merson et al. 2012 in preparation).

The main criticism levelled at semi-analytics is that they are
over-complicated with too many free (and uncertain) parameters.
The counter argument is that galaxy formation is inherently complex,
and semi-analytics serve as a tool for exploring the physics
shaping the evolution of the galaxy population below the resolution
that can be achieved in numerical simulations; these models can
be refined as empirical results improve. Furthermore, semi-analytic
models are successful in reproducing many of the key features of
the galaxy population, including the shape and evolution of the
luminosity functions of stellar mass (see Baugh 2006 for a review).

We consider the clustering properties of HAEs within the Millennium
Simulation (Springel et al. 2005), generated from three different
GALFORM simulations: Bower et al. (2006; B06), Font et al.
(2008; F08) and Lagos et al. (2011; L11). The B06 model, which
includes a recipe for AGN-driven feedback in massive halos,
successfully reproduces key features of the local and distant galaxy
population, including the black hole-bulge mass scaling at z = 0,
the shape of the $b_J$- and K-band luminosity functions at z = 0
(including the exponential turn down at high luminosities)
and the evolution of the stellar mass function of galaxies
out to z ~ 4.5. Orsi et al. (2010) studied the clustering of HAEs
in the B06 model to assess the relative merits of different selection
techniques for the construction of future galaxy redshift surveys.
The F08 and L11 models are based on B06, with the key improvements
that: (a) F08 includes a more realistic prescription for gas
cooling within satellite galaxies which orbit within massive halos,
and (b) L11 implements a pressure-based star formation law
following Blitz & Rosolowsky (2006), and a more refined model of
the ISM. We refer the reader to the respective articles that describe
each model in detail. The selection of HAEs in GALFORM is
described by Orsi et al. (2010).

The predicted galaxy correlation functions are effectively
identical in slope and amplitude in all three models, with
$r = 3.8$-$4.2\,h^{-1}$ Mpc when the amplitude of the real space correlation
function is equal to unity, $\xi(r) = 1$. The similarity between the
predictions is perhaps not surprising, given the similarities in the
underlying galaxy formation models. This is in reasonable agreement
with the amplitude of the real space correlation function estimated
from the de-projection of the angular correlation function of real
HAEs. In Figure 2 we compare $\xi(r)$ measured directly from the
simulations to our power law and scaled dark matter models of
the real HAE angular correlation function. As Figure 2 shows,
both power law and scaled dark matter fits to the data almost
exactly match the clustering strength of GALFORM HAEs on scales
$r > 0.5\,h^{-1}$ Mpc. GALFORM has less clustering than scaled dark
matter at smaller (single halo) scales. We explore this in the next
section, with a more sophisticated model of the clustering of HAEs
than simply using a scaled version of the dark matter correlation
function.

Figure 2. A comparison of the real space correlation function of simulated
HAEs from GALFORM with $L_{\rm H\alpha} > 10^{42}$ erg s$^{-1}$ at z = 2.2 to
fits of the observed angular clustering (Fig. 1). The lines show three model
fits to the measured angular correlation function: (a) a simple power law
$\xi(r) = (r/r_0)^{-\gamma}$ (with $\gamma = 1.8$), (b) $\xi(r) = b^2 \xi_{\rm DM}$ and (c) the HOD
fit (see §4.3). On scales $\gtrsim$0.5 Mpc the models predict HAE clustering that
is in reasonable agreement with the amplitude of the clustering measured
in the observations, but the semi-analytic models predict less power at low
separation compared to the data (this is also apparent in the Bower et al. and
Font et al. models which we do not show here for clarity).



4.3 A Halo Occupation Distribution model for HAEs at 
z = 2.23 

4.3.1 Overview 

A basic tenet of our current picture of the formation of galaxies,
and their relationship to dark matter, is that galaxies inhabit dark
matter halos either as 'central' galaxies close to the density peak,
or 'satellites' distributed according to some radial density profile
(Navarro, Frenk & White 1997). Intuitively, the number of satellites
a halo can accommodate increases with halo mass; this is illustrated
in the real Universe by massive clusters of galaxies, where the central
galaxy is usually a massive elliptical surrounded by hundreds or
thousands of lower-mass cluster members. However, although the
occupation number might scale with halo mass in the stellar mass
limited case, the exact selection of galaxies in a given sample will
affect the observed halo occupation distribution. A halo occupation
distribution (HOD) model parameterises the probability distribution
that describes the likelihood that a halo of mass $M$ hosts, on
average, $N$ galaxies (see Cooray & Sheth 2002 for a review). As
the projected clustering and number density of a galaxy population
(or populations) will depend on the form of the HOD, we can
use the observed clustering data to try to constrain models of the
halo occupation of HAEs. Critical to this approach is the
parameterisation of the HOD; namely the functional form assumed for the
probability of finding a central galaxy, or $N$ satellites, in a halo of
mass $M$.

We follow the methods of Wake et al. (2008, 2011 [W11]) to
construct a halo model, and refer the reader to Appendix B of W11
for a thorough description. In brief, one must parameterise the halo
model by defining functions for the mean number of galaxies in a
given halo, $\langle N|M \rangle$. Given the good agreement between the
clustering amplitude measured from the semi-analytic models and the
data, we adjust our fiducial halo model to match the simulations;
here we have the luxury of the direct prediction of the HOD from
the model. In Figure 3 we show the HOD of $1.45 \times 10^{7}$ dark matter
halos in the Millennium Simulation, populated with HAEs using
the GALFORM model. We show the HAE HOD for three luminosity
cuts, $L_{\rm H\alpha} > 10^{41}, 10^{42}, 10^{43}$ erg s$^{-1}$.

The star-forming galaxy HOD has some important differences
from typical mass-limited HODs (cf. Zheng et al. 2007; W11) that
are worth noting. First, at the lowest halo masses, the central galaxy
distribution is approximately Gaussian, with a characteristic host
mass $M_{\rm min}$ and scale $\sigma$. At halo masses
$M \gtrsim M_{\rm min} + \sigma$ the distribution of centrals becomes
approximately flat, similar to the mass-limited case, though it does
not necessarily asymptote to $\langle N_c|M\rangle = 1$.
One could therefore envisage a simple two-component model
for the central HAE halo occupation, with a Gaussian distribution
plus a step function. At low Hα luminosities,
$L_{\rm H\alpha} \sim 10^{41}$ erg s$^{-1}$, almost every halo above a
mass of $\sim 10^{11}\,h^{-1}\,M_\odot$ hosts a central that is a HAE.
As the luminosity limit is increased, the low-mass Gaussian component
becomes more prominent (peaked) and shifts to higher halo masses,
while the occupation number declines with increasing Hα luminosity
at all halo masses.

The decline in occupation number with increasing Hα luminosity
is in part due to the form of the luminosity function, but the
shape of the central HOD is likely to be driven by (a) the stellar
mass and star formation history of central galaxies as a function of
halo mass and (b) differences in the star formation efficiency as a
function of halo mass (e.g. the cooling rate onto central galaxies).
It is also important to consider that Hα emission can result
from nuclear activity, which might be important for bright, central
HAEs in massive halos. The satellite distribution is similar to the
mass-limited case, with a smooth lower-mass cut-off in occupation
and $\langle N_s|M\rangle$ scaling as a power law at large $M$
(Kravtsov et al. 2004; Zheng et al. 2005). There is a simple
luminosity dependence, with the number of satellites declining as
$L_{\rm H\alpha}$ increases. The decline in satellite occupation at
all mass scales for the more luminous HAEs is a natural outcome of
the shape of the luminosity function, with
$L_{\rm H\alpha} = 10^{43}$ erg s$^{-1}$ probing the exponentially
declining $L > L^*$ HAEs at this redshift (Geach et al. 2008;
Sobral et al. 2012).



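4.3.2 A HOD model for Hα emitters

The central HAE distribution can be adequately described by two
components:

$$
\langle N_c|M\rangle = F_c^{B}\,(1 - F_c^{A})\,
\exp\left[-\frac{[\log(M/M_c)]^2}{2\sigma_{\log M}^2}\right]
+ F_c^{A}\left[1 + {\rm erf}\left(\frac{\log(M/M_c)}{\sigma_{\log M}}\right)\right]
\tag{9}
$$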

where $F_c^{A,B}$ are normalisation factors ranging from 0 to 1. The
first component describes the Gaussian distribution of centrals around
halos of average mass $M_c$, and the second component describes the
high-mass distribution, which we take as the standard mass-limited
step function form (Zheng et al. 2007). The parameter
$\sigma_{\log M}$ describes the typical mass range of halos with HAEs
as centrals for the Gaussian component; the exact value of
$\sigma_{\log M}$ in the second component is not critical, and so we
fix it to the Gaussian width. Similarly, we set the step function
low-mass cut-off to be $M_c$. As shown in Figure 3, this
four-parameter model provides a good description of the model HOD at
$10^{41} < L_{\rm H\alpha}/({\rm erg\,s^{-1}}) < 10^{43}$,
the pertinent range for our analysis.

Figure 3. Halo occupation distribution (HOD) model of HAEs at z = 2.2 predicted by GALFORM, where $\langle N_{\rm gal}|M\rangle$ denotes the mean number of galaxies
in a halo of mass $M$. We show the HODs of central and satellite galaxies with Hα luminosities of (left to right panels) $L_{\rm H\alpha} > 10^{41}$, $10^{42}$, $10^{43}$ erg s$^{-1}$
(points). The total number of halos (that occupy the Millennium Simulation volume) in this model is $1.45 \times 10^{7}$ (error bars are Poisson). There is a clear
luminosity dependence to the HOD, with the occupation number dropping at all halo masses with increasing Hα luminosity. The lines corresponding to
'central', 'satellite' and 'total' show the best fit to the points extracted from GALFORM using our parametric HOD described in §4.3. At all luminosities we
can fit the HOD with the same parametric form, and we adopt this model in our fitting of the observed projected correlation function.

The number of satellite galaxies is described by a smoothed
step function similar to the central galaxy distribution for
mass-limited samples (Zheng et al. 2007), but with the added component of
a power-law scaling at masses larger than the critical mass,
$M_{\rm min}$:

$$
\langle N_s|M\rangle = F_s\left[1 + {\rm erf}\left(\frac{\log(M/M_{\rm min})}{\delta_{\log M}}\right)\right]
\left(\frac{M}{M_{\rm min}}\right)^{\alpha}
\tag{10}
$$



The parameter $F_s$ is the mean number of satellite galaxies at the
transition mass $M_{\rm min}$ (the characteristic mass above which
halos can contain satellite HAEs). The parameter $\alpha$ controls the
abundance of star-forming satellites for $M > M_{\rm min}$. This
functional form provides a more satisfactory fit to the model
satellite distribution at low masses, allowing a more gradual cut-off
to the power law than is assumed in the standard stellar-mass-limited
case (e.g. Wake et al. 2011). We make no restrictions as to whether a
central HAE is required for the hosting of satellites, so the mean
total number of galaxies in a halo of mass $M$ is

$$
\langle N|M\rangle = \langle N_c|M\rangle + \langle N_s|M\rangle .
\tag{11}
$$
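As a minimal sketch of equations 9-11, the occupation functions can be written with the Python standard library alone. The function and argument names are illustrative rather than taken from any released code, the logarithms are assumed to be base 10, and masses are assumed to share the units of $M_c$ and $M_{\rm min}$.

```python
import math


def mean_centrals(mass, m_c, sigma_logm, f_a, f_b):
    """Equation 9: Gaussian component plus smoothed step for central HAEs."""
    x = math.log10(mass / m_c)  # assumption: base-10 logarithm
    gaussian = f_b * (1.0 - f_a) * math.exp(-x**2 / (2.0 * sigma_logm**2))
    step = f_a * (1.0 + math.erf(x / sigma_logm))
    return gaussian + step


def mean_satellites(mass, m_min, delta_logm, f_s, alpha):
    """Equation 10: smoothed step multiplied by a power law in M / M_min."""
    x = math.log10(mass / m_min)
    return f_s * (1.0 + math.erf(x / delta_logm)) * (mass / m_min) ** alpha


def mean_total(mass, m_c, sigma_logm, f_a, f_b, m_min, delta_logm, f_s, alpha):
    """Equation 11: centrals and satellites are added independently."""
    return (mean_centrals(mass, m_c, sigma_logm, f_a, f_b)
            + mean_satellites(mass, m_min, delta_logm, f_s, alpha))
```

Evaluated at $M = M_{\rm min}$ the satellite term reduces to $F_s$, which matches the description of $F_s$ as the occupation at the transition mass.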



There are up to eight free parameters in this HOD. However,
we choose to fix some in our modelling, given the size of the current
sample. The exact smoothing of the satellite low-mass cut-off is not
particularly important, in that satellites close to the threshold (in
the model) do not contribute significantly to the halo occupation.
Therefore we fix $\delta_{\log M} = \sigma_{\log M}$. Although we do
not require a halo to contain a Hα-emitting central in order to host
satellite HAEs, we also constrain the satellite threshold mass as
$M_{\rm min} = M_c$. Finally, we fix the slope of the satellite
distribution to $\alpha = 1$; this is close to the model fit across
the full luminosity range shown in Figure 3, and is in agreement with
the value found for mass-limited samples. Thus, our model has five
free parameters. Note that having a consistent model that scales with
Hα luminosity is of benefit to our analysis, given the possible
uncertainties in the fidelity of observed and simulated Hα fluxes.
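The reduction from eight to five parameters can be made explicit with a small helper that expands the free parameters into the full set used by equations 9 and 10. This is only an illustrative sketch; the class and key names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FreeParams:
    """The five parameters left free in the fit (names are illustrative)."""
    f_a: float         # F_c^A, normalisation of the step component
    f_b: float         # F_c^B, normalisation of the Gaussian component
    m_c: float         # M_c, characteristic halo mass for centrals
    sigma_logm: float  # sigma_logM, width of the central distribution
    f_s: float         # F_s, satellite normalisation


def expand(free):
    """Return all eight HOD parameters after applying the three constraints."""
    return {
        "f_a": free.f_a,
        "f_b": free.f_b,
        "m_c": free.m_c,
        "sigma_logm": free.sigma_logm,
        "f_s": free.f_s,
        "m_min": free.m_c,              # M_min = M_c
        "delta_logm": free.sigma_logm,  # delta_logM = sigma_logM
        "alpha": 1.0,                   # satellite slope fixed to unity
    }
```

Keeping the constraints in one place means a sampler only ever proposes the five free values, while the occupation functions still receive the full parameter set.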

With $\langle N|M\rangle$ defined, the number density of galaxies is
given by integrating the mean occupation over the halo mass function
$n(M)$:

$$
n_g = \int {\rm d}M\, n(M)\, \langle N|M\rangle ,
\tag{12}
$$

and this can be used as an additional constraint in the fitting of the
HOD, provided the number density of galaxies is known, although
it is often difficult to produce fits that simultaneously match the
clustering and abundance (e.g. Quadri et al. 2008). Here we use the
latest parameterisation for $n(M)$ from Tinker et al. (2010). With
the halo model set up, $\xi(r)$ is defined (Cooray & Sheth 2002), and
this can be projected to the angular correlation function
$\omega(\theta)$ using Limber's equation.

We can also define other parameters that are useful to summarise
the halo model: the satellite fraction,

$$
f_{\rm sat} = \int {\rm d}M\, n(M)\, \langle N_c|M\rangle\, \langle N_s|M\rangle \,/\, n_g ,
\tag{13}
$$

which measures the fraction of galaxies in the sample that are
satellites; the effective halo mass,

$$
M_{\rm eff} = \int {\rm d}M\, M\, n(M)\, \langle N|M\rangle \,/\, n_g ,
\tag{14}
$$

and the effective galaxy bias,

$$
b_g = \int {\rm d}M\, n(M)\, b(M)\, \langle N|M\rangle \,/\, n_g ,
\tag{15}
$$

where $b(M)$ is the bias for a halo of mass $M$.
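The integrals in equations 12-15 can be sketched numerically with a trapezoidal rule over a tabulated mass grid. The mass function $n(M)$ and bias $b(M)$ are treated here as user-supplied arrays (the Tinker et al. 2010 fits are not implemented), and the function names are hypothetical.

```python
def trapezoid(y, x):
    """Trapezoidal integral of samples y over the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def summarise_hod(mass, n_m, bias, n_cen, n_sat):
    """Evaluate equations 12-15 on a tabulated halo mass grid.

    mass  : halo masses M
    n_m   : halo mass function n(M) at each mass (user supplied)
    bias  : halo bias b(M) at each mass (user supplied)
    n_cen : <N_c|M> at each mass
    n_sat : <N_s|M> at each mass
    """
    n_tot = [c + s for c, s in zip(n_cen, n_sat)]               # equation 11
    n_g = trapezoid([n * t for n, t in zip(n_m, n_tot)], mass)   # equation 12
    # Equation 13, implemented exactly as written in the text.
    f_sat = trapezoid([n * c * s for n, c, s in zip(n_m, n_cen, n_sat)],
                      mass) / n_g
    m_eff = trapezoid([m * n * t for m, n, t in zip(mass, n_m, n_tot)],
                      mass) / n_g                                # equation 14
    b_g = trapezoid([n * b * t for n, b, t in zip(n_m, bias, n_tot)],
                    mass) / n_g                                  # equation 15
    return {"n_g": n_g, "f_sat": f_sat, "m_eff": m_eff, "b_g": b_g}
```

In practice the grid would be logarithmically spaced in $M$ and fine enough to resolve the Gaussian component of the central occupation; the linear grid is used here only to keep the sketch short.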
4.3.3 HOD fitting results

We assert from the outset that, with the current data (i.e. relatively 
small sample number), the interpretation of the results of this HOD 
analysis must be taken with caution. Given the degeneracies in- 
volved, the results should only be used as an early guide. Never- 
theless, the HOD provides an elegant framework within which to 
discuss the observed clustering, and we examine the results here. 

The angular correlation function derived from the HOD described
above is fit to the data, including the full covariance matrix.
As in W11, minimisation is achieved by using a Markov
Chain Monte Carlo technique, which allows us to efficiently
explore the parameter volume. The best fit $\omega(\theta)$ is shown
in Figure 1, with a reduced $\chi^2/\nu = 0.7$, again indicating that
our data are too coarse to constrain the model. Although we present
the best-fitting model here, there are large degeneracies in the
current halo model that the data cannot resolve. This means that the
key halo parameters described in §4.3.2 are only poorly constrained.
The difference between the HOD model and the real-space correlation
function measured from GALFORM simulations is shown in Figure 2.
Most of the parameters in equations 9 and 10 have very poor
constraints. For example, the normalisation factors $F_c^{A,B}$ are
effectively unconstrained, the 68% confidence interval for the minimum
halo mass hosting centrals (and the minimum mass for satellites) is
large, $M_c \sim (0.1$-$13) \times 10^{12}\,h^{-1}\,M_\odot$, and the
$1\sigma$ upper limit of the satellite fraction is
$f_{\rm sat} \lesssim 0.46$.

There are clearly indications of serious degeneracies in the
model that cannot be resolved with the current data; these are a
common problem for samples of galaxies where just a small fraction
of the population is detected. Only the average bias and mean
halo mass are reasonably well constrained, with $b \approx 2.4$ and
$M_{\rm eff} \approx 1.3 \times 10^{12}\,h^{-1}\,M_\odot$, in
agreement with what was found for the scaled dark matter fit in §4.1.
We summarise the results from the HOD fit in Table 1, along with the
results from the power-law and dark matter fits.
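The fitting step described above combines a $\chi^2$ statistic that uses the full covariance matrix with a Markov Chain Monte Carlo exploration of the parameters. The sketch below is a generic random-walk Metropolis sampler in pure Python, offered as an illustration of the technique under simple assumptions (Gaussian likelihood, Gaussian proposals); it is not the W11 implementation, and all names are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L of a symmetric positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def chi_squared(data, model, cov):
    """chi^2 = r^T C^-1 r, with r = data - model and C the full covariance."""
    low = cholesky(cov)
    resid = [d - m for d, m in zip(data, model)]
    y = []  # forward-substitute L y = r, so that chi^2 = y . y
    for i, r in enumerate(resid):
        y.append((r - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return sum(v * v for v in y)


def metropolis(log_post, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampler with independent Gaussian proposals."""
    rng = random.Random(seed)
    current, current_lp = list(start), log_post(start)
    chain = []
    for _ in range(n_steps):
        trial = [c + rng.gauss(0.0, s) for c, s in zip(current, step)]
        trial_lp = log_post(trial)
        # Accept with probability min(1, exp(trial_lp - current_lp)).
        if trial_lp >= current_lp or rng.random() < math.exp(trial_lp - current_lp):
            current, current_lp = trial, trial_lp
        chain.append(list(current))
    return chain
```

A fit would pass something like `lambda p: -0.5 * chi_squared(data, model(p), cov)` as `log_post`, where `model` is a hypothetical function returning the HOD prediction for $\omega(\theta)$ at the measured separations; broad or multi-modal chains are then the signature of the degeneracies discussed above.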



5 DISCUSSION 

5.1 The fate of HAEs at z = 2.23 

The clustering amplitude of $z = 2.23$ HAEs is similar to other
star-forming populations at high-z. Adelberger et al. (2005) present
a clustering analysis of $U_nG\mathcal{R}$ (BX/BM) selected
star-forming galaxies at $1.4 < z < 3.5$ and derive a correlation
length of $r_0 \sim 4\,h^{-1}$ Mpc across this redshift range, and
argue that, at $z \sim 2.2$, star-forming (BX) galaxies with
$M_\star \approx 10^{10}\,M_\odot$ reside in dark matter halos of mass
$\sim 10^{12}\,M_\odot$. Hayashi et al. (2007) present a clustering
analysis of star-forming 'sBzK' selected galaxies (Daddi et al. 2004)
at $z \sim 2$, which are a similar population to the broad-band BX
selected galaxies described above, finding
$r_0 \approx 3.2\,h^{-1}$ Mpc and typical halo masses of
$2.8 \times 10^{11}\,M_\odot$.

The average stellar mass of HAEs in our sample is
$\log(M_\star/M_\odot) = 9.4$ (calculated from stellar population fits
to the homogenised UV-optical-near-IR photometry using the templates
of Bruzual & Charlot 2007, including the thermally pulsating
Asymptotic Giant Branch population; Sobral et al. 2011). The key
improvement made here is that our selection is far more exclusive
than broad-band selections, with the narrowband technique
corresponding to a nearly pure SFR selection over a very narrow
redshift range. This has the effect of minimising contamination
(important for an accurate measurement of the clustering amplitude for
a specific population), and the tomographic nature of the selection
should improve the contrast of scale-dependent features in the
projected clustering.

Figure 4. Comparison of the correlation length of HAEs and star-forming
galaxies since z = 2.2 derived from de-projected angular clustering
measurements. We compare the measured values to the predicted halo mass
hosting galaxies with correlation length $r_0$ for our cosmology. We
distinguish between measurements made from samples of HAEs selected in
narrow-band and more general star-forming galaxies selected in broad-band
surveys (the latter have much broader redshift distributions). Note that
evolutionary trends are hard to measure in this plot, given that the low
redshift surveys generally probe lower luminosity systems, and there is
observed to be a strong correlation between clustering strength and
luminosity (i.e. SFR). Indeed, Sobral et al. (2010) show that $r_0$
increases to $r_0 \sim 5\,h^{-1}$ Mpc at z = 0.84 when galaxies with
$L_{\rm H\alpha} \gtrsim 10^{42}$ erg s$^{-1}$ are considered.
In summary, 'normal' star-forming galaxies with
SFR $\sim 1$-$100\,M_\odot\,{\rm yr}^{-1}$
have been hosted by dark matter halos with
$10^{10} \lesssim M_h \lesssim 10^{12}\,h^{-1}\,M_\odot$
since z = 2.2, with more luminous and massive systems residing in more
massive halos at all epochs.

Hayashi et al. note the clear stellar mass ($K$-band luminosity)
dependence of the clustering strength, indicating that the
descendants of sBzK galaxies could range from sub-Milky Way mass
halos to halos similar to rich clusters. Sobral et al. (2010) also
find a clear increase in the derived correlation length for HAEs at
z = 0.84 when the sample is split by stellar mass and Hα luminosity,
such that more massive and luminous (i.e. high SFR) galaxies reside in
more massive dark matter halos. The 'varied fates' of star-forming
galaxies at $z \sim 2$ have been discussed by Conroy et al. (2008),
who examine the evolutionary history of the dark matter halos hosting
star-forming galaxies in $N$-body simulations, finding that
generically selected star-forming galaxies at $z \sim 2$ do
not evolve into any single class of galaxy by z = 0. The number
density of the descendants of model $z \sim 2$ star-forming galaxies
at z = 0 drops by a factor of two due to the merging of descendants
in the interval 0 < z < 2. Of the remaining galaxies that
did not merge, 70% evolve into central galaxies within halos of
$M_h \gtrsim 10^{12}\,h^{-1}\,M_\odot$ by z = 0. Central galaxies at
z = 0 correspond to $>L^*$ systems, whereas the star-forming galaxies
that are destined to become satellites by z = 0 are generally
lower-mass systems owing to the slower/halted rate of stellar mass
growth expected for sub-halos orbiting within massive halos (i.e. a
decline in the cooling rate and potential expulsion of gas, q.v.
§4.2). Gonzalez et al. (2011) find a similar result for submillimeter
selected galaxies within GALFORM, with the descendants of these
high-z star-forming galaxies evolving into z = 0 galaxies with stellar
masses $\sim 10^{10-11}\,h^{-1}\,M_\odot$.

Although we expect the HAEs in our sample to evolve into
a range of galaxy types, we can estimate the halo mass of the
descendants of the average HAE in our sample, i.e. those hosted by
halos with the 'characteristic' mass found in our clustering
analysis. Assuming $M_{\rm eff} \approx 1.3 \times 10^{12}\,h^{-1}\,M_\odot$
(with the uncertainty from the HOD fit) at z = 2.23,
we use the median halo mass growth rate from Fakhouri, Ma &
Boylan-Kolchin (2010) to estimate that by z = 0 the average HAE
is destined to reside in a halo of mass
$M_h = 2$-$5 \times 10^{12}\,h^{-1}\,M_\odot$.
Thus, HAEs are an important population to study in the context of
understanding the ecology of 'typical' galaxies in the local
Universe, although as described above, there are likely to be important
mass and luminosity dependencies in the exact evolutionary
trajectory of HAEs (as hinted at by Figures 3 and 4), which our current
data cannot resolve.
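The descendant mass estimate can be reproduced approximately by integrating a median halo growth rate from z = 2.23 to z = 0. The sketch below assumes the Fakhouri, Ma & Boylan-Kolchin (2010) median fit has the form $\dot{M} = 25.3\,M_\odot\,{\rm yr}^{-1}\,(M/10^{12}\,M_\odot)^{1.1}(1 + 1.65z)\,E(z)$ with $E(z) = \sqrt{\Omega_m(1+z)^3 + \Omega_\Lambda}$; these coefficients, the flat cosmology and the value h = 0.7 are assumptions of the sketch and should be checked against that paper.

```python
H100_PER_YR = 1.0227e-10  # 100 km/s/Mpc expressed in 1/yr


def growth_rhs(z, m12, h):
    """dm/dz for m = M / 1e12 Msun under the assumed median growth fit.

    Mdot is proportional to E(z) and dz/dt = -(1 + z) H0 E(z), so the
    E(z) factors cancel and no density parameters are needed.
    """
    h0 = h * H100_PER_YR
    return -(25.3e-12 / h0) * m12**1.1 * (1.0 + 1.65 * z) / (1.0 + z)


def descendant_mass(m_start_hinv, z_start, h=0.7, n_steps=1000):
    """Integrate from z_start to z = 0 with RK4; masses in h^-1 Msun."""
    m = m_start_hinv / h / 1e12  # convert to units of 1e12 Msun
    dz = -z_start / n_steps      # negative step: integrate towards z = 0
    z = z_start
    for _ in range(n_steps):
        k1 = growth_rhs(z, m, h)
        k2 = growth_rhs(z + 0.5 * dz, m + 0.5 * dz * k1, h)
        k3 = growth_rhs(z + 0.5 * dz, m + 0.5 * dz * k2, h)
        k4 = growth_rhs(z + dz, m + dz * k3, h)
        m += dz * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        z += dz
    return m * 1e12 * h
```

Under these assumptions, `descendant_mass(1.3e12, 2.23)` gives roughly $4 \times 10^{12}\,h^{-1}\,M_\odot$, which lies inside the range quoted above.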



5.2 Comparison with other Hα surveys at low redshift

Sobral et al. (2010) present a clustering analysis of HAEs
detected in HiZELS at a redshift of z = 0.84 (narrow $J$-band
selection, probing to lower Hα luminosities than the present survey)
and find a strong luminosity dependence to the clustering strength,
$2 \lesssim r_0 \lesssim 5\,h^{-1}$ Mpc for
$41.6 < \log(L_{\rm H\alpha}/{\rm erg\,s^{-1}}) < 43.2$, with
the clustering strength increasing with luminosity (similar to the
trend seen in other samples, as described above). Our sample is
too small to split into luminosity bins and retain sufficient
signal-to-noise in the clustering measurement. At an equivalent
luminosity limit to the one used in our analysis, the clustering
strength of HAEs at z = 0.84 is similar to that at z = 2.23,
indicating only weak evolution in the clustering properties of
star-forming galaxies with SFR $> 10\,M_\odot\,{\rm yr}^{-1}$ over
this range. Shioya et al. (2008) and Nakajima et al. (2008) present
clustering analyses for HAEs at z = 0.24 and z = 0.4 respectively,
finding correlation lengths of $\sim 1.5$-$2\,h^{-1}$ Mpc. However,
those studies probe fainter HAEs than our sample contains, and
therefore it is difficult to assess any redshift evolution in the
clustering properties of HAEs to these later epochs given the
expected strong luminosity dependence of $r_0$.

We summarise this comparison in Figure 4, where we compare
the derived correlation length of samples of narrow-band
selected HAEs and the more generic broad-band selections of
star-forming galaxies described above. The broad range in luminosity
limits in the $r_0$-$z$ plot (Shioya et al. 2008 probe Hα
luminosities over two orders of magnitude lower than our sample) masks
any evidence of evolution in the clustering of star-forming galaxies.
Indeed, the characteristic luminosity of HAEs is itself a strong
function of redshift, with
$\log(L^*/{\rm erg\,s^{-1}}) = 0.45z + 41.87$ since
z = 2.23 (Sobral et al. 2012). It is clear, however, that 'typical'
star-forming galaxies (i.e. those close to $L^*$ and not in the
ultraluminous class, such as submillimeter selected galaxies; see
Hickox et al. 2012) have, on average, been hosted by dark matter halos
with $10^{10} \lesssim M_h \lesssim 10^{12}\,h^{-1}\,M_\odot$ since
z = 2.2, with the amplitude of
clustering decreasing for less luminous and lower mass systems.
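To make the size of the luminosity mismatch between surveys concrete, the linear relation for $L^*$ quoted above can be evaluated at the redshifts of the narrow-band samples discussed in this section. This is a simple illustration; the function name is hypothetical and the relation is used only within 0 < z < 2.23, where the text states it applies.

```python
def log_lstar(z):
    """log10(L*/erg s^-1) from the linear relation quoted in the text."""
    return 0.45 * z + 41.87


# Redshifts of the narrow-band HAE samples compared in this section.
for z in (0.24, 0.40, 0.84, 2.23):
    print(f"z = {z:.2f}: log10(L*) = {log_lstar(z):.2f}")
```

The characteristic luminosity changes by roughly 0.9 dex between z = 0.24 and z = 2.23, so samples cut at a fixed $L_{\rm H\alpha}$ probe quite different parts of the luminosity function at each epoch.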

Figure 4 presents an average representation of the clustering
properties of star-forming galaxies. In reality, HAEs are expected
to reside in halos with a range of masses (as modelled by our HOD
for example), and this will have important consequences for their
fate. In the next section we illustrate this with an example from our
data: an apparent over-density of HAEs in the COSMOS field,
perhaps representing star-forming galaxies tracing a rather massive,
rare dark matter halo.

Figure 5. A potentially massive halo in the COSMOS field, blindly
detected as an over-density of HAEs in HiZELS. The large points show HAEs
meeting the selection criteria used in the present study (smaller points are
HAEs with lower line fluxes). The colour background and contours show
the smoothed density contrast,
$\delta = (\rho - \langle\rho\rangle)/\langle\rho\rangle$, clearly
indicating a significant peak in the mean surface density. Interestingly,
this structure contains a Hα emitting z = 2.23 QSO close to the peak
(cross symbol); such active systems are often used as 'signpost' objects
around which to search for over-dense structures. HAEs in this structure
exemplify the contribution of star-forming satellites producing power in
the correlation function at low angular separations.



5.3 A comment on satellite HAEs and cosmic variance: the
detection of an over-dense structure in the COSMOS field

The measured correlation function implies that satellites play a
non-negligible role in the small-scale clustering power. In the halo
model described in §4.3, massive halos with large numbers of
bright Hα-emitting satellites are rare objects, as dictated by the
luminosity and halo mass functions. However, such systems might
be detectable in large surveys such as ours as local over-densities
in the surface density of HAEs. We have detected such a system in
the COSMOS field.

We have evaluated the local density contrast across the field
by first calculating a simple local density measure
$\rho = 4/\pi r_4^2$, where $r_4$ is the angular distance to the
fourth nearest HAE from an arbitrary point. This is normalised to give
the density contrast,
$\delta = (\rho - \langle\rho\rangle)/\langle\rho\rangle$, where
$\langle\rho\rangle$ is the mean surface density of HAEs
across the field. We evaluate $\delta$ across a grid, and then smooth
this with a Gaussian kernel of FWHM equivalent to 5 co-moving Mpc.
The peak density contrast is $\delta = 17$ at
$10^{\rm h}00^{\rm m}50^{\rm s}$, $+02^\circ00'53''$.
We do not detect a similar structure in the UDS field, implying
the sky density of environments of similar mass is of the order of
one per two square degrees. Systems such as this illustrate the
importance of taking into account cosmic variance in clustering
measurements of HAEs. As Figure 1 shows, the small-scale clustering
power in the angular correlation function is dominated by the
COSMOS field, and this local over-density is likely to be a dominant
contributing factor, with the one-halo term boosting $\omega(\theta)$
at scales below 1 Mpc. The cosmic variance uncertainty is encoded into
the delete-one jackknife method we have employed, since the bulk of
the over-density is easily encompassed by one of the sub-volumes.
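The fourth-nearest-neighbour density and the resulting contrast can be sketched as follows. The example assumes flat-sky coordinates in arbitrary angular units, takes $\langle\rho\rangle$ to be the number of sources divided by the field area, and omits the Gaussian smoothing step; the function names are hypothetical.

```python
import math


def nth_nearest_density(point, positions, n=4):
    """Surface density rho = n / (pi r_n^2), with r_n the n-th nearest distance."""
    dists = sorted(math.hypot(x - point[0], y - point[1]) for x, y in positions)
    r_n = dists[n - 1]
    return n / (math.pi * r_n**2)


def density_contrast(grid, positions, field_area, n=4):
    """delta = (rho - <rho>) / <rho> evaluated at each grid point.

    <rho> is taken here as the mean surface density N / area (an assumption).
    """
    mean_rho = len(positions) / field_area
    return [(nth_nearest_density(p, positions, n) - mean_rho) / mean_rho
            for p in grid]
```

Because $\rho$ scales as $r_4^{-2}$, a grid point sitting inside a compact group of four or more HAEs returns a large positive $\delta$, which is the behaviour exploited to flag the COSMOS structure.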

Figure 5 shows the sky plot of HAEs around the peak of the
over-density, including a representation of the smoothed density
field. Interestingly, the peak encompasses the z = 2.2396 quasar
SDSS J100051.92+015919.2 (Prescott et al. 2006), which is itself
a HAE (and included in our sample). Extremely luminous galaxies
such as quasars and radio galaxies are often used to seek out dense
environments, relying on the fact that these extreme, but rare, active
galaxies are likely to be highly biased tracers of the matter field and
therefore reside in massive halos (Ellingson et al. 1991; Clowes &
Campusano 1991; Bower & Smail 1997; Miller et al. 2004; Boris
et al. 2007; Hatch et al. 2011; Matsuda et al. 2011). In this case the
COSMOS structure was blindly detected and turns out to harbour
a quasar, lending support to the approach of imaging the fields
of active galaxies with narrowband surveys to discover such (rare)
environments.



6 SUMMARY 

We have presented an analysis of the clustering properties of 370
Hα emitting galaxies at z = 2.23, selected in two independent,
degree-scale fields as part of the HiZELS survey. Using a series of
increasingly sophisticated models of the clustering, we find:

(i) The average correlation function can be broadly modelled as
a power law, with slope $\beta = 0.8$. Although there are clear
deviations from the simple power law on all scales, the normalisation
of the power law fit provides an adequate estimate of the physical
correlation length of HAEs, $r_0 = 3.7 \pm 0.3\,h^{-1}$ Mpc, similar
to other star-forming populations at this redshift. We find that the
latest semi-analytic models of galaxy formation predict a correlation
length that is in good agreement with the measured value.

(ii) The shape of the observed correlation function is more
accurately reproduced by scaling the projected correlation function of
dark matter with a bias factor:
$\omega_{\rm HAE} = b^2\,\omega_{\rm DM}$. This is superior
to the simple power law as it is a better description of the
variation of the power in the correlation function across the full
range of measured scales, $0.1 < r < 10\,h^{-1}$ Mpc. The best-fitting
value for HAEs is $b_{\rm HAE} \approx 2.4$. This can be related to a
characteristic halo mass, which we find to be
$\log(M_h/[h^{-1}\,M_\odot]) = 11.7 \pm 0.1$.

(iii) Our final model attempts to fit the HAE clustering using
a halo occupation distribution (HOD) model. To parameterise the
occupation of central and satellite HAEs in dark matter halos, we
turn to the semi-analytic models for motivation (which predict the
HOD), given the good agreement between the model and the data
described above. Although the HOD is poorly constrained by the current
data (with clear degeneracies resulting in multiple acceptable fits to
the angular clustering), we derive an average bias and characteristic
halo mass in good agreement with those derived from the scaled dark
matter correlation function, with $b \approx 2.4$ and effective halo
mass $M_{\rm eff} \approx 1.3 \times 10^{12}\,h^{-1}\,M_\odot$.

(iv) Finally, we report on the detection of a significant localised
over-density of HAEs in the COSMOS field. Interestingly, this
structure encompasses a z = 2.23 QSO, which is itself a HAE.
It is clear from the clustering analysis that cosmic variance in HAE
surveys remains important on $\sim 1$ deg$^2$ scales, especially in
the fluctuations expected in the small-scale clustering amplitude. The
HAE structure is likely to trace a relatively massive halo,
$M_h \sim 10^{13}\,M_\odot$, with a high satellite occupation number,
and could be destined to evolve into a large group or cluster of
galaxies by z = 0.

Future high redshift Hα surveys with improved statistics over
wider fields will be able to explore halo models of HAEs in further
detail. Our current result represents a first step in this direction, and
despite the limited information we can extract from the clustering
models, it is clear that disentangling the relative role of central and
satellite star formation in massive halos at high redshift is an
important component of our understanding of the efficiency of stellar
mass assembly as a function of halo mass. Multi-epoch Hα surveys
such as HiZELS will be essential for examining evolutionary trends
in the clustering properties of star-forming galaxies at the peak era
of galaxy formation, and we aim to investigate this in a forthcoming
paper.